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Abstract 

The series of mean daily temperature of air recorded over a period 
of 215 years is used for analysing the dimensionality and the predicta- 
bility of the atmospheric system. The total number of data points of 
the series is 78527. Other 37 versions of the original scries arc gener- 
ated, including "seasonally adjusted" data, a smoothed series, series 
without annual course, etc. 

Modified methods of Grassberger &; Procaccia are applied. A pro- 
cedure for selection of the "meaningful" scaling region is proposed. 
Several scaling regions are revealed in the In C(r) versus Inr diagram. 
The first one in the range of larger Inr has a gradual slope and the 
second one in the range of intermediate Inr has a fast slope. Other 
two regions are settled in the range of small Inr. The results lead 
us to claim that the series arises from the activity of at least two 
subsystems. The first subsystem is low-dimensional (d/ = 1.6) and it 
possesses the potential predictability of several weeks. We suggest that 
this subsystem is connected with seasonal variability of weather. The 
second subsystem is high-dimensional {df > 17) and its error-doubling 
time is about 4-7 days. 

It is found that the predictability differs in dependence on season. 
The predictability time for summer, winter and the entire year (T2 ~ 
4.7 days) is longer than for transition-seasons {T2 ~ 4.0 days for spring, 
T2 ~ 3.6 days for autumn). 

The role of random noise and the number of data points are dis- 
cussed. It is shown that a 15-year-long daily temperature series is not 
sufficient for reliable estimations based on Grassberger &; Procaccia 
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algorithms. 

1 Introduction 

One often observes phenomena which exhibit comphcated nonperiodic be- 
haviour though they are controlled by strictly deterministic rules. Such pro- 
cesses have also been recorded in the last twenty years in computational sim- 
ulations of many nonlinear models from various areas. The algorithms did 
not contain any stochastic terms and/or numerical stability conditions were 
not violated at the same time. Systems with similar behaviour were already 
known in the last century, but the introduction of concept of the strange 
attractor and the use of efficient computers have led to better understanding 
of these phenomena. 

1963 is regarded as the beginning of the theory of deterministic chaos 
which studies such systems. In that year the well-known Lorenz's article con- 
cerning some stage of atmospheric convection was published. Lorenz [20] has 
designed system of three nonlinear differential equations whose solutions ex- 
hibit a chaotic evolution in time for particular values of control parameters. 
The Lorenz system has provided the first example of deterministic dissipative 
system with sensitive dependence on initial conditions. Since then the deter- 
ministic chaos has been detected in many areas, for example in geophysics, 
chemistry, biology, medicine and psychology, ecological systems, sociology, in 
rail vehicle dynamics, etc. (see [22]). 
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2 Dimension 



Dissipative systems with chaotic behaviour often possess a strange attractor. 
One of the principal characteristics which is used for description of strange at- 
tr actors is a dimension. The dimension reflects the complexity and strangeness 
of an attractor. Chaotic attr actors have a noninteger dimension. Integer part 
of a fractal dimension df plus one is the minimal number of independent 
variables needed to describe the time evolution of the system. The maximal 
number of such variables is2df+l for a simple system. There exists a variety 
of dimension definitions. The simplest one is the capacity dc of the attractor 



where M(r) is the minimal number of n-dimensional cubes of side r needed 
to cover the attractor A. More complicated one is the Hausdorff dimension 
dn (see [6], [7]). Generally dc > dn but it is conjectured that these dimen- 
sions are the same for typical attractors. Their common value is called the 
fractal dimension df [22]. It must be noted that roughly ten different fractal 
dimensions are used for a description of fractal sets [4] . 

The above definition of the capacity is based on metric properties of 
the attractor. Other definitions take into account the frequency of visiting 
individual parts of the attractor by the trajectory. An information dimension 
di and a correlation dimension ^2 belong among most used dimensions. Both 
dimensions are easily obtained from a generalized dimension of order q 
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where pi = Ni/N is the probabihty that a point on the attactor falls into the 
i-th cube of side r, N is the number of the all points on the attractor and Nj, is 
the number of points in the i-th cube. One obtains the fractal dimension df, 
the information dimension di and the correlation dimension d2 for q tending 
to 0, 1 and 2, respectively. Generally df = > di > d2 ■ ■ ■, except the case 
of uniform distribution of points on the attractor. 

3 Entropy 

The Kolmogorov K-entropy is another important characteristic which de- 
scribes a degree of chaoticity of the system. The entropy gives the average 
rate of information loss about a position of the phase point on the attractor. 
It is well-known that 

• K = in an ordered system 

• is infinite in a random system 

• 0<ir<ooina chaotic (deterministic) system 

The generalized entropy Kg can be defined like the generalized dimension. Let 
Y(i), t > he the trajectory of the dynamical system in an n-dimensional 
phase space sampled at discrete time intervals At. Let us divide the phase 
space into the n-dimcnsional hypcrcubes of side r. Let Pi-^i,^...i^ be the joint 
probability that the trajectory Y{t) on the attractor subsequently visits 
cubes ii, i2, • • •, in at times t — At, t — 2At, ■ ■ •, NAt. The generahzed 
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entropy is then 

Kg — — lim lim lim — \ ^ — In P.^,- . (3) 

One obtains the topological entropy Xq, the Kolmogorov entropy K and 
the ii'2-entropy for q tending to 0, 1 and 2, respectively. The i^'2-entropy is 
a lower bound for the Kolmogorov entropy and it is used as its estimate in 
most cases because the ir2-entropy is easily extracted from an experimental 
measurement [17]. 

There are other invariants used for the description of the chaotic signals: 
the spectrum of Lyapunov exponents [34], [42], [43], the Lyapunov dimension 
[7], [22], the Q-dimension [15], etc. 

4 Estimating from experimental data 
4.1 Phase space reconstruction 

One of frequent problems in experimental practice is that one has only 
a scalar signal (generated by the dynamical system) and no governing equa- 
tions. The scalars are for example temperature or pressure time series. There- 
fore the attractor has to be reconstructed in an artificial phase space. A method 
of time delay coordinates has been suggested by Takens [37] for this purpose. 
The main idea of this procedure is based on the fact that the phase space vari- 
able contains information about remaining phase space variables. The 
m-dimensional signal Y(ij) is composed of the scalar series x{f.i) measured 
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at constant sampling time intervals At = t^+i — ti as follows 

Y{U) = {x{U), x{ti + r), . . . , x{ti + (m - l)r)} , (4) 

where r is an appropriate time delay (which is an integer multiple of the 
sampling time At) and m is an embedding dimension. 

An important question is then how to choose the embedding dimension 
and the time delay. The procedure of finding the embedding dimension m is 
to increase m and to compute the fractal dimension (or another invariant) 
for every embedding dimension until the fractal dimension remains almost 
constant. This value of the fractal dimension of the reconstructed attractor is 
considered to be equal to the one of the original attractor. Usually m >2df 
is sufficient [44] . 

However, the above method is awkward in the case of a high-dimensional 
system and/or in the case of a large time series. In addition, the determi- 
nation of the constant level of the embedding dimension is subjective and 
depends on quality of the experimental data. Therefore Broomhead & King 
[3] have suggested another method for the choice of the embedding dimen- 
sion based on a singular value decomposition. Recently Breedon & Packard 
[2] have designed fuzzy delay coordinate reconstruction for the data sampled 
nonuniformly in time. 

For an infinite amount of noise-free data, the time delay r can be chosen 
almost arbitrarily. However, when the data are noisy and limited in number 
there is uncertainity similar to the one in the choice of the embedding di- 
mension. The choice of the time delay should guarantee an independence of 
the artificial phase space coordinates. The time delay is usually selected with 
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respect to an autocorrelation function of the data. Different authors have de- 
fined the time delay as the lag at which the autocorrelation function attains 
a certain value. For instance, 

• 1/e (Zeng et al. [44]) 

• 1/10 (Tsonis & Eisner [38]) 

• (Tsonis et al. [41]). 

Other authors take into account the embedding dimension m and/or a dom- 
inant periodicity T of the signal: 

• T/m (Tsonis et al. [41]) 

• w/{m — 1), where w is the first local minimum of the autocorrelation 
function or the first lag for which the autocorrelation function passes 
through zero (Poveda & Puente [29]). 

However, the autocorrelation function measures only a linear depen- 
dence. Therefore Fraser & Swinney [14] have suggested to choose the time 
delay by so called mutual information which measures the general depen- 
dence. But this method is not yet widely used. 

4.2 Estimating the dimension 

The information and the correlation dimensions are the most commonly com- 
puted invariants. Given the m-dimensional signal (§), one defines a local 
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correlation integral C^ii^r) as follows 

-1 M 

C"'i'^^) = W^ ^ ^^(r-||Y,-Y,||), (5) 
J = l 

where M — N — {m — 1)t and N is the number of data points in the original 
one-dimensional signal. 9{x) is a Heaviside step function defined by 

{1 for a; > 
(6) 
for x < 0. 

II ... II denotes an appropriate norm in the phase space, usually the Euclidean 
one. 

= „n, Ito (7) 

r— >0 M— >oo In r 

Experimentally, di may be obtained as the slope of the linear part of the curve 
In C"* versus In r. The result should be independent of i for a sufficiently large 
M. Eckman & Ruelle [8] have discussed the cases when this approach may 
fail. 

To improve the statistics one may average di{i) over several i. Grass- 
berger & Procaccia [16] have used averaging over all i and they have obtained 
the correlation dimension 

, lnC"^(r) 

d2 = hm hm — - — (8) 

r— ►0 M— >oo In r 

where C'^{r) is the correlation integral defined as folows 

^>'=M(Ar3T) i; »('-||Y.-Y,l|). (9) 

i 7^ J 
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In practice d2 is obtained by plotting In C""(r) versus Inr and deter- 
mining the slope of the curve between r^j„ and r^ax- For r less than r^j„ 
there is poor statistics due to sparseness of points and for r greater than 
Tjnax nonlinear effects deviate the dependence from the straight line. Only 
the interval {rmini^rnax) is "meaningful". No theoretical prescription exists 
for selecting the bounds Tmin and Tmax- It is left to researcher's judgement. 
An inappropriate selection may substantially devalue the result. The mean- 
ingful scaling region may be masked if the number of data points N is not 
sufficiently large and/or if the embedding dimension m exceeds a critical em- 
bedding dimension rrif. [9], [39]. We will show below that a faulty conclusion 
may be obtained if the time series is too small, although the scaling region is 
evident. An increase of the number of points in the time series by means of 
interpolation does not help and produces a spurious slope at small In r [33] . 

The question of how many points are enough is a widely discussed prob- 
lem [9], [10], [12], [24], [33], [35], [41], [44]. Many formulae determining a mini- 
mal number of data points A^^m for a reliable estimate have been established. 
However, it has been recently demonstrated that some of these formulae are 
much strict or erroneous [33], [35]. Nerenberg & Essex [24] have concluded 
that 

v/2^r (m/2 + 1) [ 2 (A; - 1) r [(m + 4) /2] ] m + 2 
(^lnifc)("+^)/M[r(l/2)]^r[(m + 3)/2]J 2 ' ^'"^ 

where T{x) is the gamma function, k = rmax/Tmin and A is permitted error 
of the estimation. This formula has been derived under preconception that 
m < c?2. Tsonis et al. [41] have shown that a satisfactory estimate of the 
correlation dimension of the Henon map can be obtained in the embedding 
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dimension m = 8 using 2000 data points only. It is much smaller number of 
points than the formula (jl^) indicates. Therefore the need for data has not 
to be the same for m > d2 as for m < d2- The minimal number of data points 
Nmin would depend on the type of the attractor [41]. 

It was mentioned above that di > d2 and that the information dimension 
is the lower bound for the fractal dimension. The inequality occurs in the limit 
case M — ^ oo in the formulae (|^) and (Q) but it is reasonable to assume that 
the equality holds good for the data limited in number [8]. Let us therefore 
consider them to be mutually equal and refer the common value as the fractal 
dimension df henceforth. 



4.3 Estimating the entropy 

Let us pay attention to estimating the entropy. Grassberger & Procaccia [17] 
have proposed that 



where 



K2 ~ lim limir™(r), (11) 



and k is a sufficiently small integer number. In practice one can not satisfy 
the limits in the above formulae. Therefore the saturation value of as m 
increases is regarded as K2. An extrapolation formula for m —>■ (yo has been 
used in the original paper of Grassberger & Procaccia [17] but its exphcit 
form has not been mentioned. The use of a maximum norm instead of the 
Euclidean norm in the equations (§) and (|^) leads to improvement of the 
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i^r™ convergence but an anisotropy of the maximum norm may cause an 
underestimation of the correlation dimension [13]. 

There is a certain degree of uncertainty in estimating the dimension and 
the entropy from experimental time series. The algorithms are usually reliable 
when dealing with artificial time scries generated by means of a computer. 
However, the results need a very careful interpretation when dealing with the 
real measured data containing inherent noise and limited in number. 

Moreover, the determination of a finite noninteger value of the fractal 
dimension and/or a finite value of the ir2-entropy greater than zero is not 
sufficient to claim that the system has a strange attractor. Osborne et al. 
[26], Osborne & Provenzale [27], Provenzale et al. [30] and Provenzale et al. 
[31] have presented a class of random fractal (coloured) noise with the finite 
noninteger correlation dimension and the converging ir2-entropy estimates. 
Such noise is characterized by a power-law power spectrum 

P{u;k) ~ 1< a < 3 (13) 

and its correlation dimension is 



Therefore necessity to distinguish between this class of noise and determin- 
istic dynamics arises in the analysis of experimental data. Some methods 
have been suggested by Provenzale et al. [31], Pavlos et al. [28] and Tsonis 
& Eisner [40]. The tests are based on generating appropriate surrogate series 
and comparing their characteristics with the ones of the original series. 
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5 Strange attractor in the atmosphere 

Application of the deterministic chaos theory to atmospheric phenomena 
has been motivated by the attempt to reveal their possible low-dimensional 
nature and reduce the number of variables needed for describing the phe- 
nonema. Indeed, the first studies referred to low-dimensional {df between 3 
and 8) weather and chmate attractors [9], [19], [23], [25], [32], [38]. Later 
these results have been criticized by other authors [33], [44]. They have ar- 
gued by considerable complexity of the atmosphere. Therefore they have not 
believed in the low-dimensional character of the atmosphere. The critics have 
considered the low estimates of the dimension as a consequence of the limited 
number of data points. Recently published papers have not thrown light on 
the problem. They have reported upon the low- dimensionality [1] on the one 
hand and upon the high-dimensionality [12] of the atmosphere on the other 
hand. 

Lorenz [21] has presented an imaginative explanation. He has shown that 
a meaningful estimate is possible by using only a relatively small number of 
data points if a variable strongly coupled with the rest of the variables of the 
system is used. However, he has shown that the fractal dimension estimate 
is undervalued if a weakly coupled variable is used. In this case rather the 
dimension of a subsystem is measured. 

Note that the existence of an attractor is presupposed a priori in the 
methods described above. This preconception is reasonable in the case of 
the atmosphere because it is hard to imagine that the weather is completely 
governed by some kind of randomness. 
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5.1 Used data and methods 

The data utilized in our experiment include mean daily temperature of air 
over a period of 215 years (1 January 1775 - 31 December 1989) recorded at 
the Klementinum Observatory, Prague, Czech Republic. The total number 
of data points is 78527 which is the largest sample measured at one mete- 
orological station, used for such an analysis. The reader may have doubts 
about the quality of the data, especially at the beginning of the observation. 
However, Hlavac [18] has paid extensive attention to homogenization of the 
series, especially with the respect to accuracy of the measurement in the 18th 
and the 19th centuries. 

In addition to the entire temperature time series, different modifications 
of it are explored. They are cut versions, a series without annual course and 
its cut versions, a smoothed series, a time-differenced series and "seasonally 
adjusted" series. Together it gives 38 versions of the original series. 

Because of the fact that the sample is very extensive the correlation 
integral (|) is replaced by 

1 I M 

C^"^W = 7P^E E %-||Y,:-Y,||), J = 250 (15) 

J = l 

in order to reduce a heavy computational burden. The selection of i is led by 
the idea of roughly uniform distribution of indices i between 1 and M: 

p = 0,l,...,/-l, (16) 

where [x] denotes the integer part of x. 

14 



J + l 



The fractal dimension df is estimated from the straight hne slope fit- 
ted to the "meaningful" range of the plot In C"^(r) versus Inr. The fitting 
is carried out by the least-squares regression in the interval < lnri,lnr„ >, 
In rmin < Inri < lnr„ < Inr^ax- In order to objectify the selection of the ap- 
propriate scaling region the bounds Inri and lnr„ are determinated so that 
the term 



i E (in Tk - In r) (in - In ) 



^ (in Tk - In r) J E^ (in C,- - In Cf) 



(17) 



where 



lnr = -$:inr,, In C,- = - E l^^^r, 



(18) 



k=l " k=l 

is maximal. This means that the estimate is done in that coherent part of the 
plot in which the absolute value of the hnear regression coefficient r'"(ri, r„) is 
maximal in every embedding dimension. The restriction n > 15 is introduced 
for the scaling region in order to be sufficiently large. 

The aforementioned method was tested by Henon map and gave very 
promising results. We obtained df — 1.21 for m — 10, N — 5000. Results 
were more precise for lower embedding dimensions {df — 1.25 for m — 4, 
= 5000). 



5.2 Estimating the dimension of the chmate attract or 

The fractal dimension estimates are performed for the following series: the 
mean daily temperature series of air recorded from 1 January, 1775 until 31 
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December, 1989 (hereafter the original series), fifteen consecutive 15-year- 
long versions of the original series (hereafter the cut series of the original 
one), the original series without annual course (hereafter the filtered series), 
fifteen consecutive 15-year-long versions of the filtered series (hereafter the 
cut scries of the filtered one), a spring scries, a summer scries, an autumn 
series, a winter series, the original scries smoothed by five-day moving av- 
erages (hereafter the smoothed series), a first time-difference of the original 
series (hereafter differenced series). 

In Figure 1 one may see the course of the autocorrelation functions for 
the individual series. Table 1 shows the lag at which the autocorrelation 
functions for the first time attain the values 1/e, 1/10 and 0. The threshold 
values of 1/e as the time delay r is used in this study. 

5.2.1 Original series 

Figures 2a and 2b present the InC-lnr diagram for different values of the 
embedding dimension. One can see two linear parts for the embedding dimen- 
sion greater than 4. There is the scahng region (1) at an intermediate part of 
Inr. The slope of this region increases as the embedding dimension increases 
and for m = 50 the slope attains the value of 17.5 ±0.3. The error represents 
the 90-% confidence interval of the linear least-squares fit. The second linear 
part (denoted as (2)) is evident for a larger Inr earlier than the nonlinear 
effects cause the deviation from constancy. The scahng region (2) has a very 
gradual slope with a fast convergence to the value 1.6 with the error less 
than 0.005. The saturation is already reached for the embedding dimension 
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m = 4, unlike the scaling region (1) when the "saturation" is reached be- 
tween m = 46 and 50 (see Figures 3a and 3b). The quotation marks denote 
that we are not sure whether further increasing of the embedding dimension 
would be accompanied by increasing of the slope! 

In spite of the fact that the range of the smallest lu r is biased by fluc- 
tuations, an additional region appears there for the embedding dimension 
greater than 20. This "scaling region" is divided into two parts (3) and (4) 
for the embedding dimension greater than 30. The slope of the lower region 
(3) decreases and the slope of the upper region increases as the embedding di- 
mension increases. The slopes of the scaling regions (3) and (4) are 16.1 ±0.8 
and 2.7 ± 0.1, respectively, for the embedding dimension m = 50. These sca- 
ling regions do not saturate their slope for m < 50 and they are not so clearly 
expressed as the scaling regions (1) and (2). For this moment let us make 
do with identification of the two clearly linear (see correlation coefficient in 
Table 2) scaling regions (1) and (2) with the slopes 17.5 and 1.6, respectively. 

5.2.2 Filtered series 

We could suspect that the scahng region (2) with the slope 1.6 is related to 
seasonal variations of temperature. In order to minimize the effect of seasonal 
changes of temperature a mean temperature is computed for each day over 
the original series and these daily means are subtracted from each daily value 
of temperature. For instance, the mean temperature T^^^q "^^^^ of 15 June is 
computed by temperature averaging over the all 15 Junes and then this mean 
is subtracted from temperature of each 15 June. The filtered series is obtained 
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in this way. 

One can see the hnear part (1) in the InC-lnr diagram in Figure 4 with 
a fast slope. This scahng region probably corresponds to the scahng region 
(1) of the original series. The slope "converges" to the value of 21.4 ± 0.4 for 
the embedding dimension m between 46 and 50 (see Figure 5). There are also 
the scaling regions (3) and (4) for smaller Inr. They arise for m > 30, that 
means for the higher value of the embedding dimension than in the case of 
the original series. The slopes of these scaling regions do not saturate for the 
embedding m < 50. The slope of the scaling region (4) decreases (for m = 50 
is equal to 4.1) and the slope of scaling region (3) increases (for m = 50 is 
equal to 13.9) as the embedding dimension increases. 

The scaling region with the gradual slope in the range of larger Inr is 
completely missing. This supports our hypothesis that the scaling region (2) 
in the correlation integrals of the original series is consequence of the seasonal 
course of temperature. 

5.2.3 Cut series 

Both the original and the filtered time series are divided into the fifteen 15- 

year-long consecutive intervals. The length of any cut series is 5478 or 5479 
data points. 

Dependence In C"^ on In r is qualitatively the same for all thirty cut series 
as well as for the series from which they are created. The fast convergence 
of the slope to the value of 1.6 is registered in the region of larger Inr for 
any cut series of the original one. The slope of the scaling region which was 
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denoted as (1) in the case of the original and filtered series converges to 
the values between 8.5 and 10.4 for the embedding dimension between 22 
and 38 according to the selection of 15-year interval. This is considerable 
underestimation of the slope in comparision with the original series. 

The same undervaluation can be observed for any cut scries of the filtered 
one. The convergence of the slope of the scaling region (1) to the values 
between 14.3 and 16.2 is reached for the embedding dimension between 34 
and 38 according to the selection of 15-year interval. The series of the years 
1835-1849, 1865-1879, 1880-1894, 1895-1909 and 1975-1989 are exceptions 
because no clearly saturated values of the slopes are reached and the slopes 
are greater than 17.1 for the embedding dimension m = 50. 

Therefore we conclude that a 15-year-long series of the mean daily tem- 
perature of air is not adequate to estimate a high fractal dimension based on 
the Grassberger & Procaccia algorithm. This is strongly marked for the origi- 
nal series. In this case the obtained values of the fractal dimension (between 
8.5 and 10.4) are very close to estimates of other authors [19], [38] who have 
used shorter series than ours. However, one has to keep in mind that their 
series have not been always the temperature ones (see the aforementioned 
Lorenz's experiment [21]). 

We have generated a time series by means of gaussian random process in 
order to exclude an artificial saturation of the slope of the scaling regions (1) 
due to insufficient number of the data in the case of the full time series. The 
length of this tentative series was 78527 data points, the mean was zero and 
the variance was 3.7. These parameters are close to the ones of the filtered 
series. For the random series no saturation has been observed even if the 
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calculation has been carried out to the embedding dimension m = 80. 
5.2.4 Seasonal series 

Seasonally adjusted series are created from the original one. They are the 
spring one (1 March - 31 May), the summer one (1 June - 31 August), the 
autumn one (1 September - 30 November) and the winter one (1 December 
- 29 February). 

The hnear parts (1) are clearly evident in the lnC"^-lnr diagrams for all 
seasonal series (see Figure 6). The slopes of these scaling regions are again 
large and converge as the embedding dimension increases. The "saturation" 
values are 16.4 ± 0.3? for the spring series, 20.3 ± 0.3? for the summer series, 
15.9 ± 0.3 for the autumn series and 17.7 ± 0.3? for the winter series. The 
question marks after the values indicate uncertainty of the saturation. 

The scaling regions (2) appear above the scaling regions (1) only in the 
case of the spring and autumn series. Their slopes converge towards the value 
of 4.5 ± 0.1 for the spring series and towards the value of 3.8 with the error 
less than 0.01 for the autumn series. No similar scaling regions are detectable 
in the summer and winter series. 

The existence of the scaling regions (2) in spring and autumn supports 
the above mentioned hyphothesis that the gradual slope in the correlation 
integrals in the range of large In r is created by seasonal variability of weather. 
Moreover, it is not obviously simple annual course because of their absence 
in the case of summer and winter. 
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5.2.5 Differenced series 

For a system governed by low-dimensional dynamics, the fractal dimension is 
the same for the original signal as well as for its first difference. However, the 
first difference of stochastic signal has often much larger value of the fractal 
dimension than the original signal [31]. The first time-differenced series is 
created from the original one in order to exclude or confirm the deterministic 
character of the data. 

From Figure 8a one can see one and only scaling region in the correlation 
integrals. This is quite different dependence than in the earlier cases. The 
slopes of the scahng region oscillate at first between 19.4 and 20.3 for the 
embedding dimension between 34 and 42, and later between 20.5 and 26 for 
the embedding dimension between 50 and 60 (see Figure 8b). The value of 
the fractal dimension acquired from the differenced series is closer to the one 
obtained from the filtered series than from the original series (see Table 2). 

One can immediately exclude coloured noise with high confidence owing 
to the course of the autocorrelation functions (see Figure la and Ic). The 
autocorrelation function of the coloured noise series decreases very slowly 
and in the case of unlimited number of data points never reaches the value 
of zero [41]. The autocorrelation function of the differenced series looks like 
the one of white-noise. 

Although random fluctuations are present in the temperature series we 
suppose that they are not responsible for the origin of the scaling regions (1) 
from which the fractal dimension values of 17.5 and 21.4 are estimated. We 
conclude that from the above mentioned experiment with gaussian random 
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process. We presume that one meets high-dimensional dynamics and that 
the oscillations of the correlation integral slopes between 19.4 and 20.3 and 
then around the value of 22.5 are caused by presence of inherent noise. It 
is well-known ([1], [28], [31]) that differencing acts as an amphfier of noisy 
components but weakens low-frequency components included in the original 
signal. In this way we explain the absence of another scaling region corre- 
sponding to a low value of the fractal dimension in the case of differenced 
series. Pavlos et al. [28] have observed similar low-dimensional chaotic com- 
ponent cut off due to differencing. 

5.2.6 Smoothed series 

The high-frequency oscillations can be removed by means of averaging of the 
original signal over an appropriate time interval. In this study 5-day moving 
averages arc applied. 

The correlation integrals computed from the scries smoothed in this way 
are shown in Figures 9a and 9b. The dependence is similar to the one for the 
original series (see Figures 2 and 9). However, the individual scaling regions 
are more evident than for the original series. This is particularly valid for the 
scaling regions (3) and (4). This phenomenon is easy to understand because 
random fluctuations reside in the range of smaller scales and the smoothing 
removes them. 

The values of the fractal dimension estimated from the scaling regions 
(1) and (2) are slightly smaller than the values from the non-smoothed series. 
We get the values 16.2 ± 0.3 and 1.5 with the error less than 0.004 from the 
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scaling regions (1) and (2), respectively. The decrease of the value of the 
fractal dimension for smoothed series has been observed by Pavlos et al. [28] 
as well. We assume that the decrease is due to high-frequency random noise. 
If an m-dimensional signal is reconstructed from one-dimensional random 
sequence then the cloud of m-dimcnsional points fills out completely the 
m-dimensional phase space. If the signal is composed of the deterministic 
and stochastic components then the stochastic one preserves its tendency to 
fill out the phase space. This causes larger smearing of the trajectories in 
the phase space and one then measures higher dimension than in the purely 
deterministic case. 

Let us pay attention to the scaling regions (3) and (4). The slopes of the 
scahng regions (3) and (4) oscillate between 14.8 and 17.7, and between 1.5 
and 1.7, respectively. These values are very close to the ones of the scaling 
regions (1) and (2) (see Figures 10a and 10b). That looks like the dynamics 
is duplicated in the smaller scales. Could it be possible that the scaling 
regions (3) and (4) are manifestation of additional subsystems which reside 
in very small scales and, therefore, they are not clearly detectable in the 
signal contaminated by high-frequency noise? However, it is quite possible 
that used algorithm introduces this "symmetry" artificially if it is applied on 
the extreme embedding dimension. Many experiments have to be carried out 
in order to confirm one of these hypotheses or another different one. 
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5.3 Estimating the entropy of the chmate attractor 

The ii'2-6ntropy is estimated for the original, filtered, smoothed and all sea- 
sonally adjusted series. The formula ( [T3| ) is again used instead of the correla- 
tion integral (P). In order to reduce fluctuations and improve the statistics the 
formula (|T^) is averaged over five values computed for different embedding 
dimensions. This yields 

Kl^{r) = ^y ^\Yi \ y\ L = 5. (19) 

Dependence of on the embedding dimension m is approximated by means 
of least-squares fit by the function 

/(x)=a + ^. (20) 

This function converges to a for c > and x — ^ oo. Therefore 

K^{r) = K,{r) + A, ^ > 0. (21) 

real parameters, m is the embedding dimension and K2{r) is the 
entropy which depends on the selection of the scaling region. The formula 
(pTD describes the dependence of on m very appropriately (see Figures 
11, 12, 13 and Table 3). 

The value of an error-doubling time is more illustrative of the predicta- 
bility of the atmosphere than the value of the i^2-entropy. The error- doubling 
time is defined as follows 

The estimates given in Table 3 are obtained from the scaling regions (1) 
and (2) of the correlation integrals. It has not been possible to acquire reliable 
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estimates from the scaling regions (3) and (4) owing to large oscillations of 
KJ]^ in the range of the smallest Inr. 

Let us introduce results of the estimates from the scaling regions (1) first. 
The error- doubling times of the systems reconstructed from the original and 
filtered series are 4.7±0.9 days and 5.9 ±0.9 days, respectively. These values, 
especially the latter one, are very close to the time scale of the synoptic 
fluctuations. 

The transition-seasons (spring and autumn) exhibit increase of the value 
of the ii'2-entropy and decrease of the value of the error-doubling time (T2 = 
4.0 ± 0.4 days for spring and T2 = 3.6 ± 0.4 for autumn). The greater values 
of the error-doubling time are obtained for the summer and winter seasons 
(4.7 ± 0.4 days for summer and 4.7 ± 0.6 days for winter). These values 
are the same as for the entire year. Therefore the most difficult prediction 
should be expected in spring and autumn. This should not be surprising 
if one remembers changeable weather during these transition-seasons which 
is connected with the change of general circulation from summer to winter 
and vice versa. The estimate from the smoothed series yields the value of 
T2 = 6.1 ±0.3 days. 

As mentioned above, the scaling regions (2) with the gradual slope are 
revealed in the ranges of larger In r in the original, smoothed, spring and au- 
tumn series. The results of the estimates of the i^'2-entropy from these regions 
are shown in Figures 14 and 15 and in Table 3. The values for spring are not 
involved because a "plateau" in the K^-\nr diagram is clearly detectable 
only for the embedding dimensions greater than 52 and the extrapolation 
by means of the formula ( pTD is unreliable. The error-doubling times com- 
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puted from the scaling regions (2) are surprisingly high even if the values 
of KJ^ for m = 50 are not so high (28.1 days for the entire year, 27.0 for 
the smoothed series, 15.5 for autumn). However, the formula (|21|) gives the 
relatively extreme estimates which need further verification. 

For this purpose, it is possible to use the maximum norm instead of the 
Euclidean one. Improvement of the iir2-entropy calculation can be reached 
by means of dimension scaled distances technique [13]. In this way, one could 
avoid the extrapolation. Further possibility is to utilize another extrapolation 
formula instead of (0). Formula 

which has been proposed by Frank et al. [13] should be suitable. However, 
the formula needs some modification in our case because the averaging is 



introduced by means of (plQ]). Therefore one has 
instead of (p^). 

Some of the aforementioned improvements of the i^'2-entropy calculation 
will be carried out in future. 



6 Discussion and conclusions 

Some ideas from nonlinear time series analysis have been used to study the 
dimensionality and the predictability of weather system. The analysis of the 
series of mean daily temperature of air and the series generated by it has been 
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carried out. The fractal dimension and the i^2-entropy estimates have been 
performed by means of modified algorithms of Grassberger & Procaccia. Since 
a careless selection of the scaling region in the In C""-ln r diagram may often 
lead to erroneous conclusions the attempt to objectify the choice has been 
introduced by means of maximization of the formula (0) in every embedding 
dimension. The formula (plD has been utilized for the extrapolation of Kl^ 
for m oo and formula ( p^ has been devised for the same aim. 

The series has exhibited multiscaled character. Several scaling regions 
of the correlation integrals have been revealed. The first one possessed the 
gradual slope in the range of larger Inr. The second one possessed the fast 
slope in the range of smaller Inr. This arrangement can be explained accord- 
ing to Eckman & Ruelle [8] as a product of two dynamical subsystems A 
and B. Suppose that the subsystems A and B are noninteracting or that the 
subsystem B evolves independently of A, but the evolution of A may depend 
on B. Let Xa and Xb be the signals of the subsystems A and B, respectively. 
Suppose that the amplitude of the first signal X^ is less than that of the 
signal Xb- Then one has an information about the complete system {A + B) 
in the region Inr < Inr^. In the region Inr ^ Inr^ one has an information 
about the subsystem B only. The "knee" in the correlation integral may also 
result from a purely stochastic process [31] but this is not our case. Other two 
"scaling" regions have been found in the range of smallest Inr. They have 
been most visible for the smoothed series. Our preliminary results indicate 
that this regions are due to the temporal correlations between nearby points 
of the series. The detailed calculation will be published later (see also [45]). 

We conclude that the temperature series has been formed as a result 
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of activity of at least two subsystems. One subsystem has the fractal di- 
mension greater than 17, the other one is low-dimensional {df — 1.6). The 
error-doubling time of the high-dimensional subsystem is comparable with 
the time scale of the synoptic fluctuations (from four to six days) . The low- 
dimcnsional subsystem exhibits much larger potential predictability. In spite 
of some problems in the i^r2-entropy estimation and the surprisingly high 
obtained values of the error-doubling time, we suggest that the value of 
T2 for the low-dimensional subsystem is of the order of a few weeks. The 
low-dimensional subsystem is probably related to the seasonal variability of 
weather because it was not detectable in the series without annual course 
and in some of seasonally adjusted series. 

The values of the ii'2-entropy of the high-dimensional subsystem demon- 
strate lower predictability time in transition-seasons (spring and autumn). 

We conclude that 15-year-long mean daily temperature series are not 
sufficient for the reliable estimation of a high fractal dimension based on 
the Grassberger & Procaccia algorithm. The values of the fractal dimension 
of the high-dimensional subsystem estimated from the cut series have been 
reduced roughly by half in comparison with the full time series. 

The temperature series has not purely deterministic character. This has 
been demonstrated by decreasing of the fractal dimension and the K2-entropY 
for the smoothed series and also by increasing of them for the differenced 
series, in comparison with the original one. Further tests should be carried 
out in order to distinguish precisely between the deterministic component 
and noise. 

The estimates of the dimension and the entropy are only some of many 
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steps which ought to lead to more complete understanding of the dynamics 
of the atmospheric processes. Our attempt should continue by calculation of 
other invariants. It is desirable to carry out the estimates from series of other 
meteorological elements. Observations from different meteorological stations 
should be used in order to remove local impacts. The research ought to culmi- 
nate in attempt at nonlinear prediction [5], [11], [36], [40]. Here an interesting 
question arises in relation to the low-dimensional subsystem with higher po- 
tential predictability. Namely, is a successful long-term prediction possible if 
the high-dimensional subsystem is excluded from the signal? Moreover, it is 
not clear whether (and how much) such prediction would be different from 
a climatological normal. 
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Series 


lie 


1/10 





original 


65 


85 


92 


filtered 


6 


12-18 


22-185 


smoothed 


66 


85 


92 


sprinf] 


9 


17 


21 


summer 


4 


13 


32 


autumn 


10 


17 


21 


winter 


7 


25 


52 


differenced 


1 


1 


1 



Tab. 1: The lag (day) at which the autocorrelation function attains the values of 1/e, 
1/10 and 0, for selected series. 
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SERIES 


N 


scaling region 


Clj 


correlation coefficient 


original 


78527 


(1) 


? 


17.5±0.3 


0.99929 






(2) 




1 6±0 005 


0.99998 


filtered 


78527 


(1) 


? 


21.4±0.4 


0.99937 


smoothed 


78523 


(1) 


? 


16.2±0.3 


0.99934 






(2) 




1 5±0 004 


0.99999 






(3) 




14.8-17.7 


0.98393 






(4) 




1.5-1.7 


0.99941 


spring 


19780 


(1) 


? 


16.4±0.3 


0.99922 






(2) 




4.5±0.1 


0.99909 


summer 


19780 


(1) 


? 


20.3±0.4 


0.99941 


autumn 


19565 


(1) 




15.9±0.3 


0.99914 






(2) 




3.8±0.01 


0.99996 


winter 


19402 


(1) 


? 


17.7±0.3 


0.99957 


differenced 


78526 




20.5-26.0 


0.99989 



Tab. 2: The estimates of the fractal dimension df for analysed series. Error represents 
the 90 % confidence limits of the least-squares fit. The question mark indicates uncertainty 
of saturation. 
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SERIES 


sealing region 




To 


e 


eorrelation eoeffieient 


original 


(1) 


0.14658±0.03229 


4.7±0.9 


1.50 


0.99138 




(2) 


0.00535±0. 00221 


129.5±37.8 


1.02 


0.98945 


filtered 


(1) 


0.11758±0.02162 


5.9±0.9 


1.10 


0.99539 


smoothed 


(1) 


0.10909±0.00501 


6.4±0.3 


1.13 


0.99919 




(2) 


0.00240±0.00053 


288.3±51.7 


0.97 


0.99960 


spring 


(1) 


0.17358±0.01687 


4.0±0.4 


1.61 


0.99272 


summer 


(1) 


0.14664±0. 01400 


4.7±0.4 


1.34 


0.99850 


autumn 


(1) 


0.19495±0.02143 


3.6±0.4 


1.85 


0.99532 




(2) 


0.00892±0.00168 


77.7±12.4 


1.15 


0.99651 


winter 


(1) 


0.14717±0.02322 


4.7±0.6 


1.45 


0.99566 



Tab. 3: The estimates of the ii'2-entropy (1/day) and the error-doubling time T2 (day) for analysed series, 
c is coefficient used for regression. Error represents the 90 % confidence limits of the least-squares fit. 
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Figures 



Fig. 1 db. Autocorrelation function of the original series (a) and the smoothed series (b). 



Fig. lb. Autocorrelation function of the "seasonally adjusted" series: spring (a), summer 
(b), autumn (c), winter (d). 



Fig. Ic. Autocorrelation function of the filtered series (a) and the differenced series (b). 



Fig. 2a. In C™ — In r diagram for embedding dimensions m = 2, 6, 26 (ordered from 
left to right) of the original series. 



Fig. 2b . In C™ — In r diagram for embedding dimensions m = 30, 34, ...,50 (ordered from 
left to right) of the original series. 



Fig. 3a. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling region (1) of the original series. 



Fig. 3b. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling region (2) of the original series. 



Fig. 4a. In C™ — In r diagram for embedding dimensions m = 2, 6, 26 (ordered from 
left to right) of the filtered series. 



Fig. 4b. In C™ — In r diagram for embedding dimensions m = 30, 34, ...,50 (ordered from 
left to right) of the filtered series. 



Fig. 5. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling region (1) of the filtered series. 



Fig. 6a. InC™ — Inr diagram for embedding dimensions m = 2, 10,20, ...,60 (ordered 
from left to right) of the spring series. 



Fig. 6b. InC™ — Inr diagram for embedding dimensions m = 2, 10,20, ...,60 (ordered 
from left to right) of the summer series. 



Fig. 6c. In C™ — Inr diagram for embedding dimensions m = 2, 10,20, ...,60 (ordered 
from left to right) of the autumn series. 



Fig. 6d. InC™ — Inr diagram for embedding dimensions m = 2, 10,20, ...,60 (ordered 
from left to right) of the winter series. 



Fig. 7a. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling regions (1) and (2) of the spring series. 



Fig. 7b. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling region (1) of the summer series. 



Fig. 7c. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling regions (1) and (2) of the autumn series. 



Fig. 7d. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling region (1) of the winter series. 



Fig. 8a. In C™ — Inr diagram for embedding dimensions m = 2, 10,20, ...,60 (ordered 
from left to right) of the differenced series. 



Fig. 8b. Plot of fractal dimension J as a function of embedding dimension m of the 
differenced series. 



Fig. 9a. In C™ — In r diagram for embedding dimensions m = 2, 6, 26 (ordered from 
left to right) of the smoothed series. 



Fig. 9b . In C™ — In r diagram for embedding dimensions m = 30, 34, ...,50 (ordered from 
left to right) of the smoothed series. 



Fig. 10a. Plot of fractal dimension J as a function of embedding dimension m for the 
scaling regions (1) and (3) of the smoothed series. 



. 10b: Plot of fractal dimension J as a function of embedding dimension m for the 
scaling regions (2) and (4) of the smoothed series. 



.11a. Plot of AT function of embedding dimension m for the scaling region 
(1) of the original series. 



. lib. Plot of AT function of embedding dimension m for the scaling region 
(1) of the filtered series. 



. 12a. Plot of AT function of embedding dimension m for the scaling region 
(1) of the spring series. 



. 12b. Plot of AT function of embedding dimension m for the scaling region 
(1) of the summer series. 



. 13a. Plot of AT function of embedding dimension m for the scaling region 
(1) of the autumn series. 



13b. Plot of AT function of embedding dimension m for the scaling region 
1) of the winter series. 



14a. Plot of AT function of embedding dimension m for the scaling region 
2) of the original series. 



14b. Plot of AT function of embedding dimension m for the scaling region 
2) of the autumn series. 



15a. Plot of AT function of embedding dimension m for the scaling region 
1) of the smoothed series. 



15b. Plot of AT function of embedding dimension m for the scaling region 
2) of the smoothed series. 



Fig. la 




□ 



lOO 200 300 
step (day) — 



400 




□ 



lOO 200 300 
step (day) — 



400 



Fig. lb 





Fig. 2a 




Fig. 3a 




Fig. 3b 




Fig. 4a 




Fig. 5 




□ 12 3 4 5 
In r > 



Fig. 6a 




Fig. 6b 




□ 12 3 4 5 
In r > 



Fig. 6c 




Fig. 6d 




□ 10 20 30 40 50 60 



Fig. 7a 




□ 10 20 30 43 53 63 



Fig. 7b 




□ 10 20 30 40 50 60 



Fig. 7c 




□ 10 20 30 43 53 63 



Fig. 7d 




Fig. 8b 




□ 12 3 4 5 
In r > 



Fig. 9a 




Fig. 9b 





Fig. 10b 





Fig. lib 





Fig. 12b 





Fig. 13b 




□ 10 20 30 40 50 60 
m / 



Fig. 14a 




Fig. 14b 





Fig. 15b 



